AgreementMakerLight results for OAEI 2013

Authors

Daniel Faria

Catia Pesquita

Emanuel Santos

Isabel F. Cruz

Francisco M. Couto

Abstract

AgreementMakerLight (AML) is an automated ontology matching framework based on element-level matching and the use of external resources as background knowledge. This paper describes the configuration of AML for the OAEI 2013 competition and discusses its results. Because AML is a newly developed and still incomplete system, our focus in this year’s OAEI was on the anatomy and large biomedical ontologies tracks, wherein background knowledge plays a critical role. Nevertheless, AML was fairly successful in other tracks as well, showing that in many ontology matching tasks, a lightweight approach based solely on element-level matching can compete with more complex approaches.

1 Presentation of the system

1.1 State, purpose, general statement

AgreementMakerLight (AML) is an automated ontology matching framework derived from the AgreementMaker system [2, 4]. It was developed with the main goal of tackling very large ontology matching problems such as those in the life science domain, which AgreementMaker cannot handle efficiently. The key design principles of AML were efficiency and simplicity, although flexibility and extensibility (key features of AgreementMaker) were also high on the list [5]. Additionally, AML drew upon the knowledge accumulated in AgreementMaker by reusing, adapting, and building upon many of its components. Finally, one of the main paradigms of AML is the use of external resources as background knowledge in ontology matching.

AML is primarily focused on lexically rich ontologies in general and on life sciences ontologies in particular, although it can be adapted to many other ontology matching tasks, thanks to its flexible and extensible framework. However, due to its short development time (eight months), it does not yet include components for instance matching or translation, and thus cannot handle all ontology matching tasks.

1.2 Specific techniques used

The AML workflow for the OAEI 2013 can be divided into six steps, as shown in Fig. 1: ontology loading, baseline matching and profiling, background knowledge matching (optional), extension matching and selection, property matching (conditional), and repair (optional).

Fig. 1. The AgreementMakerLight Workflow for the OAEI 2013.

Ontology Loading

In the ontology loading step, AML reads and processes each of the input ontologies and stores the information necessary for the subsequent steps in its own data structures. First, AML reads the localName, labels and synonym properties of all classes, normalizes them, and enters them into the Lexicon [5] of that ontology. Then, it derives new synonyms for each name in the Lexicon by removing leading and trailing stop words [8], and by removing name sections within parentheses. After class names, AML reads the class-subclass relationships and the disjoint clauses and stores them in the RelationshipMap [5]. Finally, AML reads the name, type, domain, and range of each property and stores them in the PropertyList. Note that AML currently does not store or use comments, definitions, or instances.

Baseline Matching and Profiling

In the baseline matching and profiling step, AML employs an efficient weighted string-equivalence algorithm, the Lexical Matcher [5], to obtain a baseline class alignment between the input ontologies. Then, AML profiles the matching problem by assessing the size (i.e., number of classes) of the input ontologies, the cardinality of the baseline alignment, and the property/class ratio.

Regarding size, AML divides matching problems into three size categories (small, medium or large), which affect decisions and thresholds during the background knowledge matching and the extension matching and selection steps. Regarding cardinality, AML also considers three categories (near-one, medium and high), which determine how selection is performed during the extension matching and selection step.
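To make the synonym derivation of the ontology loading step concrete, the following Python sketch derives name variants by stripping leading and trailing stop words and by dropping parenthesized sections. It is only an illustration under stated assumptions: the stop word list and the function names are hypothetical, names are assumed to be already normalized to lowercase, and this is not AML's actual code.

```python
import re
from typing import Set

# Hypothetical stop word list; the list AML actually uses is not given here.
STOP_WORDS = {"a", "an", "and", "in", "of", "or", "the", "to"}


def strip_stop_words(name: str) -> str:
    """Remove stop words from the start and end of a name, keeping inner ones."""
    words = name.split()
    start, end = 0, len(words)
    while start < end and words[start] in STOP_WORDS:
        start += 1
    while end > start and words[end - 1] in STOP_WORDS:
        end -= 1
    return " ".join(words[start:end])


def remove_parenthesized(name: str) -> str:
    """Remove name sections enclosed in parentheses and tidy the whitespace."""
    return " ".join(re.sub(r"\([^)]*\)", " ", name).split())


def derive_synonyms(name: str) -> Set[str]:
    """Return the new variants of an already normalized (lowercase) name."""
    no_parens = remove_parenthesized(name)
    variants = {
        strip_stop_words(name),
        no_parens,
        strip_stop_words(no_parens),
    }
    # Keep only non-empty variants that differ from the original name.
    return {v for v in variants if v and v != name}
```

With these assumptions, a name such as "the wall of heart (anatomical)" yields the variants "wall of heart (anatomical)", "the wall of heart" and "wall of heart", while a name with nothing to strip yields no new synonyms.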
As for the property/class ratio, it determines whether AML will match properties during the property matching step.

Background Knowledge Matching

For the OAEI 2013, AML employs three sources of background knowledge: Uberon [6], UMLS [1] and WordNet [10]. When using background knowledge, AML tests how well each source fits the matching problem by comparing the coverage of its alignment with the coverage of the baseline alignment.

The Uberon Matcher uses the Uberon ontology (in OWL) and a table of pre-processed Uberon cross-references (in a text file). Each input ontology is matched both against the Uberon ontology using the Lexical Matcher and directly against the cross-reference table, and AML determines which form of matching is best (giving priority to the cross-references, since they are more reliable). When Uberon is a good fit for the matching problem, it is selected as the only source of background knowledge and is used to extend the Lexicons of the input ontologies [8]. When it is a reasonable fit, its alignment is merged with the baseline alignment.

The UMLS Matcher uses a pre-processed version of the MRCONSO table from the UMLS Metathesaurus (in a text file). Each input ontology is matched against the whole UMLS table, then AML decides whether to use a single UMLS source (by comparing the coverage of all sources) or the whole table. When UMLS is a good fit for the matching problem, its alignment is used exclusively, and the extension matching and selection step is skipped. Otherwise, if it is a reasonable fit, its alignment is merged with the baseline alignment.

The WordNet Matcher queries the WordNet database for synonyms of each name in the Lexicons of the input ontologies, using the Jaws API CITATION. These synonyms are used to create temporary extended Lexicons, which are matched with the Lexical Matcher.
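The coverage test applied to each background knowledge source can be sketched in Python as follows. This is a minimal illustration, not AML's implementation: the definition of coverage (fraction of source classes that are mapped), the two gain thresholds, and all names are assumptions, since the text only distinguishes a good fit from a reasonable fit without giving values.

```python
from typing import Set, Tuple

Alignment = Set[Tuple[str, str]]  # (source class, target class) pairs


def coverage(alignment: Alignment, n_source_classes: int) -> float:
    """Assumed definition: fraction of source classes with at least one mapping."""
    if n_source_classes == 0:
        return 0.0
    return len({source for source, _ in alignment}) / n_source_classes


def assess_fit(
    background: Alignment,
    baseline: Alignment,
    n_source_classes: int,
    good_gain: float = 0.2,        # hypothetical threshold
    reasonable_gain: float = 0.05,  # hypothetical threshold
) -> str:
    """Classify a background knowledge source by its coverage gain over the baseline."""
    gain = coverage(background, n_source_classes) - coverage(baseline, n_source_classes)
    if gain >= good_gain:
        return "good"        # e.g. use the source's alignment as the primary one
    if gain >= reasonable_gain:
        return "reasonable"  # e.g. merge its alignment with the baseline
    return "poor"            # e.g. ignore the source for this problem
```

The three labels mirror the cases described above: a good fit leads to the source being used as the main source of mappings, a reasonable fit leads to a merge with the baseline alignment, and anything else is left out.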
Because WordNet is prone to induce errors, AML uses it only to extend the baseline alignment, meaning that it matches only previously unmatched classes.

Extension Matching and Selection

The extension matching and selection step comprises two matching sub-steps that alternate with two selection sub-steps. First, AML employs a word-based similarity algorithm, the Word Matcher [5], to extend the current alignment globally, followed by a selection algorithm to reduce the alignment to the desired cardinality. Then AML employs the Parametric String Matcher [5], which implements the Isub string similarity metric [11], to extend the resulting alignment locally (i.e., by matching the children, parents and siblings of already matched class pairs). This is followed by a final selection sub-step. When the matching problem is profiled as 'large', the Word Matcher is skipped because it is too memory intensive to be used globally, and its local use is subsumed by that of the Parametric String Matcher [3].

In the interactive matching track, AML employs an interactive selection algorithm, which asks the user for feedback about mappings that are in conflict or fall below a given similarity threshold, until a given number of negative answers is reached.

Property Matching

In the property matching step, AML matches the ontology properties. AML compares the properties’ types, domains and ranges, looking for mappings in the class alignment when the domains/ranges are classes. Then, if the properties have attributes in common, AML measures the word-based similarity between their names (as per the Word Matcher [5]), also employing WordNet when background knowledge is turned on.

Repair

In the repair step, AML employs a heuristic repair algorithm [9] to ensure that the final alignment is coherent with regard to disjoint clauses. The repair algorithm was used by default in all OAEI tracks, except for the Large Biomedical Ontologies track, where we ran AML both with and without repair.
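The local extension described in the extension matching and selection step (matching the children, parents and siblings of already matched class pairs) can be illustrated with the Python sketch below. Several things here are assumptions rather than AML's behaviour: the hierarchy is a plain child-to-parents dictionary, class names double as identifiers, already mapped classes are excluded from the candidates, the threshold is arbitrary, and `difflib.SequenceMatcher` is used as a stand-in for the Isub metric, which the standard library does not provide.

```python
from difflib import SequenceMatcher
from typing import Dict, Set, Tuple

Hierarchy = Dict[str, Set[str]]   # class name -> names of its direct parents
Alignment = Set[Tuple[str, str]]  # (source class, target class) pairs


def neighbours(cls: str, parents: Hierarchy) -> Set[str]:
    """Direct parents, children and siblings of a class."""
    ups = parents.get(cls, set())
    children = {c for c, ps in parents.items() if cls in ps}
    siblings = {c for c, ps in parents.items() if c != cls and ps & ups}
    return ups | children | siblings


def extend_locally(
    alignment: Alignment,
    src: Hierarchy,
    tgt: Hierarchy,
    threshold: float = 0.8,  # hypothetical similarity threshold
) -> Alignment:
    """Match only the neighbourhoods of existing mappings, not all class pairs."""
    mapped_src = {s for s, _ in alignment}
    mapped_tgt = {t for _, t in alignment}
    new_mappings: Alignment = set()
    for s, t in alignment:
        for ns in neighbours(s, src) - mapped_src:
            for nt in neighbours(t, tgt) - mapped_tgt:
                # Stand-in string similarity; AML uses the Isub metric instead.
                score = SequenceMatcher(None, ns.lower(), nt.lower()).ratio()
                if score >= threshold:
                    new_mappings.add((ns, nt))
    return new_mappings
```

Restricting the comparison to neighbourhoods keeps the number of string comparisons proportional to the size of the current alignment rather than to the product of the ontology sizes, which is why a local matcher remains usable on problems profiled as large.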
1.3 Link to the system and parameters file

The AML system and the alignments it produced for the OAEI 2013 are available at the SOMER project page (http://somer.fc.ul.pt/).

Similar resources

The AgreementMakerLight Ontology Matching System

AgreementMaker is one of the leading ontology matching systems, thanks to its combination of a flexible and extensible framework with a comprehensive user interface. In many domains, such as the biomedical, ontologies are becoming increasingly large thus presenting new challenges. We have developed a new core framework, AgreementMakerLight, focused on computational efficiency and designed to ha...

OAEI 2016 results of AML

AgreementMakerLight (AML) is an automated ontology matching system based primarily on element-level matching and on the use of external resources as background knowledge. This paper describes its configuration for the OAEI 2016 competition and discusses its results. For this OAEI edition, we tackled instance matching for the first time, thus expanding the coverage of AML to all types of ontolog...

Results of AML in OAEI 2017

AgreementMakerLight (AML) is an automated ontology matching system that was developed with both extensibility and efficiency in mind. This paper describes its configuration for the OAEI 2017 competition and discusses its results. For this OAEI edition, we built upon the instance matching foundations we laid last year, and tackled the new Hobbit track and its new evaluation platform. AML was the...

AML results for OAEI 2015

AgreementMakerLight (AML) is an automated ontology matching system based primarily on element-level matching and on the use of external resources as background knowledge. This paper describes its configuration for the OAEI 2015 competition and discusses its results. For this OAEI edition, we focused mainly on the Interactive Matching track due to its expansion, as handling user interactions on ...
